[ET-VK][qconv] Add flexible layout impl for quantized pointwise conv by pytorchbot · Pull Request #17267 · pytorch/executorch

pytorchbot · 2026-02-05T23:29:13Z

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #17221 by @SS-JIA
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/410/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/410/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/409/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/410/orig
Differential Revision: D92307253
@diff-train-skip-merge

pytorch-bot · 2026-02-05T23:29:17Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17267

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Pull Request resolved: #17221

This commit adds a flexible memory layout implementation for quantized pointwise (1x1) convolution in the ExecuTorch Vulkan backend. The key changes introduce a new operator (etvk.q8ta_conv2d_pw) that can handle multiple int8 tensor memory layouts, rather than being restricted to a single fixed layout.

Key Components Added

1. Two New GLSL Compute Shaders

- q8ta_conv2d_pw.glsl: The primary flexible-layout shader that uses BufferMetadata UBOs and layout specialization constants to support multiple memory layouts (kPackedInt8_4C1W, kPackedInt8_4W4C, kPackedInt8_4C). Uses scalar array indexing for output writes to handle different stride patterns.
- q8ta_conv2d_pw_4w4c_ref.glsl: A reference implementation specifically for 4W4C layout that uses simpler ivec4 indexing. Currently not enabled in production (gated by if (false) in C++).

Both shaders use:

- 4×8 output tiling (TILE_M=4 widths × TILE_N=8 channels per thread)
- dotPacked4x8AccSatEXT for efficient int8 dot products
- Texture2D for weight storage, buffers for input/output
- Per-channel weight quantization with symmetric int8 weights

2. C++ Operator Implementation (Q8taConv2dPW.cpp)

- prepack_quantized_conv2d_pw_weight(): Prepacks int8 weights into texture2D format optimized for the shader's access pattern
- add_q8ta_conv2d_pw_node(): Dispatches the flexible-layout shader with buffer metadata UBOs
- add_q8ta_conv2d_pw_4w4c_node(): Dispatches the 4W4C-specific reference shader
- q8ta_conv2d_pw(): High-level operator that handles argument parsing, weight prepacking, and kernel selection

3. Test Infrastructure Updates

- TestQ8taConv2d.cpp: Added test_q8ta_conv2d_pw() test operator that wraps quantize → conv2d_pw → dequantize for end-to-end testing
- test_q8ta_conv2d_pw.cpp: Comprehensive test suite with:
  - Multiple input sizes (3→32, 32→64, 64→96, 7→13, 40→80 channels, etc.)
  - Performance test cases (480→160, 48→22, 128→128, 576→64 channels)
  - Tests across 3 memory layouts: kPackedInt8_4C1W, kPackedInt8_4W4C, kPackedInt8_4C
  - Both texture and buffer storage types for floating-point tensors
  - Reference implementation comparison for correctness validation

Architecture

The shader handles layout flexibility via:

1. Layout specialization constants (outp_layout, inp_layout) passed from C++
2. BufferMetadata UBOs providing runtime strides for input/output tensors
3. compute_outp_buffer_idx() function that computes correct buffer indices based on layout
4. get_outer_packed_dim_block_size() from block_indexing.glslh to determine stride patterns

ghstack-source-id: 338638556
@exported-using-ghexport

Differential Revision: [D92307253](https://our.internmc.facebook.com/intern/diff/D92307253/)
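
The packed int8 dot-product accumulation behind the 4×8 tiling described above can be modeled in a few lines. This is a minimal sketch in Python rather than GLSL, for illustration only. It assumes signed 8-bit lanes with lane 0 in the low byte of each 32-bit word, and the helper names (pack4_int8, unpack4_int8, dot_packed_4x8_acc_sat, pointwise_tile) and the array shapes are hypothetical, not taken from the PR. Layout-dependent indexing, bias, and requantization of the int32 accumulators are left out.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
TILE_M, TILE_N = 4, 8  # widths x output channels per thread, per the PR description


def pack4_int8(vals):
    """Pack four signed int8 values into one 32-bit word (assumed: lane 0 = low byte)."""
    assert len(vals) == 4
    word = 0
    for lane, v in enumerate(vals):
        assert -128 <= v <= 127
        word |= (v & 0xFF) << (8 * lane)
    return word


def unpack4_int8(word):
    """Inverse of pack4_int8: recover four signed int8 lanes from a 32-bit word."""
    lanes = []
    for lane in range(4):
        byte = (word >> (8 * lane)) & 0xFF
        lanes.append(byte - 256 if byte >= 128 else byte)
    return lanes


def dot_packed_4x8_acc_sat(a, b, acc):
    """Model of a signed 4x8-bit packed dot product with saturating int32 accumulate."""
    dot = sum(x * y for x, y in zip(unpack4_int8(a), unpack4_int8(b)))
    return max(INT32_MIN, min(INT32_MAX, acc + dot))


def pointwise_tile(inp_words, wgt_words):
    """Accumulate one TILE_M x TILE_N output tile of a 1x1 convolution.

    inp_words[m][k]: packed word for width position m, input-channel group k
    wgt_words[n][k]: packed word for output channel n, input-channel group k
    Each group k covers four consecutive input channels.
    """
    acc = [[0] * TILE_N for _ in range(TILE_M)]
    for m in range(TILE_M):
        for n in range(TILE_N):
            for k in range(len(inp_words[m])):
                acc[m][n] = dot_packed_4x8_acc_sat(
                    inp_words[m][k], wgt_words[n][k], acc[m][n]
                )
    return acc
```

Because a pointwise convolution has no spatial kernel extent, each output element reduces to a dot product over input channels. Packing four channels per word lets one instruction consume four multiply-adds, and the 4×8 tile lets each loaded input word be reused across eight output channels.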

github-actions · 2026-02-06T00:40:05Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

pytorchbot requested a review from SS-JIA as a code owner February 5, 2026 23:29

meta-cla bot added the CLA Signed label Feb 5, 2026

SS-JIA force-pushed the gh/SS-JIA/409/orig branch from 059231f to c16aaf5 February 6, 2026 00:38

SS-JIA requested review from kirklandsign and larryliu0820 as code owners February 6, 2026 00:38

Base automatically changed from gh/SS-JIA/409/orig to main February 6, 2026 00:38

SS-JIA force-pushed the gh/SS-JIA/410/orig branch from 6876097 to 3eb0bd4 February 6, 2026 00:39

SS-JIA approved these changes Feb 6, 2026

SS-JIA merged commit 4bd58bb into main Feb 6, 2026
29 of 30 checks passed

SS-JIA deleted the gh/SS-JIA/410/orig branch February 6, 2026 00:39

SS-JIA merged 1 commit into main from gh/SS-JIA/410/orig
